A COMPUTABILITY CHALLENGE: ASYMPTOTIC BOUNDS 
AND ISOLATED ERROR-CORRECTING CODES 

Yuri I. Manin 

Max-Planck-Institut für Mathematik, Bonn, Germany

Dedicated to Professor C. S. Calude, on his 60th birthday






ABSTRACT. Consider the set of all error-correcting block codes over a fixed
alphabet with $q$ letters. It determines a recursively enumerable set of points in the
unit square with coordinates $(R, \delta) :=$ (relative transmission rate, relative minimal
distance). Limit points of this set form a closed subset, defined by $R \le \alpha_q(\delta)$, where
$\alpha_q(\delta)$ is a continuous decreasing function called the asymptotic bound. Its existence was
proved by the author in 1981, but all attempts to find an explicit formula for it have so
far failed.

In this note I consider the question whether this function is computable in the
sense of constructive mathematics, and discuss some arguments suggesting that the
answer might be negative.

1. Introduction.

1.1. Notation. This paper is a short survey focusing on an unsolved problem
of the theory of error-correcting codes (cf. the monograph [VlaNoTsfa]).

Briefly, we choose and fix an integer $q \ge 2$ and a finite set, the alphabet $A$, of
cardinality $q$. An (unstructured) code $C$ is defined as a non-empty subset $C \subset A^n$
of words of length $n \ge 1$. Such $C$ determines its code point $P_C = (R(C), \delta(C))$ in
the $(R, \delta)$-plane, where $R(C)$ is called the transmission rate and $\delta(C)$ is the relative
minimal distance of the code. They are defined by the formulas

$$\delta(C) := \frac{d(C)}{n(C)}, \quad d(C) := \min\,\{d(a,b) \mid a, b \in C,\ a \ne b\}, \quad n(C) := n,$$

$$R(C) = \frac{k(C)}{n(C)}, \quad k(C) := \log_q \mathrm{card}(C), \tag{1.1}$$

where $d(a, b)$ is the Hamming distance

$$d((a_i), (b_i)) := \mathrm{card}\,\{i \in (1, \dots, n) \mid a_i \ne b_i\}.$$

In the degenerate case $\mathrm{card}\, C = 1$ we put $d(C) = 0$. We will call the numbers
$k = k(C)$, $n = n(C)$, $d = d(C)$ code parameters and refer to $C$ as an $[n,k,d]_q$-code.
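
As an illustration of (1.1), here is a minimal Python sketch that computes the code parameters and the code point of a small code given as a set of words. The function names (`hamming`, `code_parameters`, `code_point`) and the representation of words as tuples are assumptions made for this example only.

```python
from fractions import Fraction
from itertools import combinations
from math import log


def hamming(a, b):
    """Number of positions where the words a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def code_parameters(code, q):
    """Return (n, k, d) for a non-empty set of equal-length words over q letters."""
    words = set(code)
    n = len(next(iter(words)))
    assert all(len(w) == n for w in words), "all words must have length n"
    k = log(len(words), q)  # k = log_q card(C), generally a real number
    # min over an empty family of pairs: d = 0 in the degenerate case card(C) = 1
    d = min((hamming(a, b) for a, b in combinations(words, 2)), default=0)
    return n, k, d


def code_point(code, q):
    """Return (R, delta) = (k/n, d/n) as in (1.1); R is a float, delta is exact."""
    n, k, d = code_parameters(code, q)
    return k / n, Fraction(d, n)


# Binary repetition code of length 3: two words at Hamming distance 3.
repetition = {(0, 0, 0), (1, 1, 1)}
```

For the repetition code above, `code_parameters(repetition, 2)` gives $n = 3$, $k = 1$, $d = 3$, so the code point is $(1/3, 1)$.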

A considerable bulk of research in this domain is dedicated either to the
construction of (families of) "good" codes (e.g. algebraic-geometric ones), or to the
proof that "too good" codes do not exist. A code is good if in a sense it maximizes
simultaneously the transmission rate and the minimal distance. To be useful in
applications, a good code must also come with feasible algorithms of encoding and
decoding. The latter task includes the problem of finding a closest (in Hamming's
metric) word in $C$, given an arbitrary word in $A^n$ that can be an output of a noisy
transmission channel (error correction). Feasible algorithms exist for certain classes
of structured codes. The simplest and most popular example is that of linear codes:
$A$ is endowed with a structure of a finite field $\mathbf{F}_q$, $A^n$ becomes a linear space over
$\mathbf{F}_q$, and $C$ is required to be a linear subspace.

1.2. Asymptotic bounds. Since the demands made on good codes are mutually
conflicting, it is natural to look for the limits of what is possible.

A precise formulation of the notion of good codes can be given in terms of two
notions: asymptotic bounds and isolated codes.

Fix $q$ and denote by $V_q$ the set of all points $P_C$, corresponding to all
$[n,k,d]_q$-codes. Define the code domain $U_q$ as the set of limit points of $V_q$.

It was proved in [Man1] that $U_q$ consists of all points in $[0, 1]^2$ lying below the
graph of a certain continuous decreasing function $\alpha_q$:

$$U_q = \{(R, \delta) \mid R \le \alpha_q(\delta)\}. \tag{1.2}$$

Moreover, $\alpha_q(0) = 1$, $\alpha_q(\delta) = 0$ for $1 - q^{-1} \le \delta \le 1$, and the graph of $\alpha_q$ is tangent
to the $R$-axis at $(1, 0)$ and to the $\delta$-axis at $(0, 1 - q^{-1})$.

This curve is called the asymptotic bound. (In fact, [Man1] considered only linear
codes, and the respective objects are now called $V_q^{lin}$, $U_q^{lin}$, $\alpha_q^{lin}$; the unstructured case
can be treated in the same way with minimal changes: cf. [ManVla] and [ManMar].)

Now, a code can be considered a good one, if its point either lies in Uq and is 
close to the asymptotic bound, or is isolated, that is, lies above the asymptotic 
bound. 

1.3. Computability problems. There is an abundant literature establishing
upper and lower estimates for asymptotic bounds, and providing many isolated
codes. However, not only are "exact formulas" for asymptotic bounds unknown,
but even the question whether $\alpha_q(\delta)$ is differentiable remains open (of course, since
this function is monotone and continuous, it is differentiable almost everywhere).
Similarly, the structure of the set of isolated code points is a mystery: for example,
are there points on $R = \alpha_q(\delta)$, $0 < R < 1 - q^{-1}$, that are limit points of isolated
codes?

The principal goal of this report is to discuss weaker versions of these problems,
replacing "exact formulas" by "computability". In particular, we try to elucidate
the following

QUESTION. Is the function $\alpha_q(\delta)$ computable?

As our basic model of computability we adopt the one described in [BratWe] and
further developed in [BratPre], [Brat], [BratMiNi]. In its simplest concrete version,
it involves approximations of closed subsets of $\mathbf{R}^2$, such as $U_q$ or the graph of $\alpha_q$, by
unions of computable sets of rational coordinate squares, "pixels" of varying size.

The following mental experiment suggests that the answer to this computability
problem may not be obvious, and that $\alpha_q$ might even be uncomputable and by
implication not expressible by any reasonable "explicit formula".

Imagine that a computer is drawing finite approximations to the set of code
points $V_q$ by plotting all points with $n \le N$ for a large $N$ (appropriately matching
a chosen pixel size). What will we see on the screen?
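
For very small $N$ the experiment can actually be run by brute force. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the alphabet is taken to be $\{0, \dots, q-1\}$, and the search over all subsets of $A^n$ is doubly exponential in $n$, so it is usable only for tiny parameters such as $q = 2$, $n \le 3$.

```python
from itertools import combinations, product


def min_distance(words):
    """Minimal Hamming distance of a code; 0 if it has a single word."""
    return min((sum(x != y for x, y in zip(a, b))
                for a, b in combinations(words, 2)), default=0)


def parameter_triples(q, max_n):
    """All triples (n, M, d) realised by codes C in A^n, n <= max_n, M = card(C).

    Exhaustive search over every non-empty subset of A^n: feasible only
    for very small q and max_n.
    """
    triples = set()
    for n in range(1, max_n + 1):
        space = list(product(range(q), repeat=n))
        for size in range(1, len(space) + 1):
            for code in combinations(space, size):
                triples.add((n, size, min_distance(code)))
    return triples
```

Each triple $(n, M, d)$ gives one point to plot, with coordinates $(\log_q M / n,\ d/n)$ as in (1.1). Already for $n \le 2$ and $q = 2$ several different codes share the same triple, which is the phenomenon of multiplicity discussed in Section 2.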

Conjecturally, we will not see a dark domain approximating $U_q$ with a cloud of
isolated points above it, but rather an eroded version of the Varshamov-Gilbert
curve lying (at least partially) strictly below $R = \alpha_q(\delta)$:

$$R = \frac{1}{2}\bigl(1 - \delta \log_q (q - 1) + \delta \log_q \delta + (1 - \delta) \log_q (1 - \delta)\bigr) \tag{1.3}$$

In fact, "most" code points lie "near" (1.3): cf. Exercise 1.3.23 in [VlaNoTsfa] and
some precise statements in [BaFo] (for $q = 2$).
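
The curve (1.3) is easy to evaluate numerically. The following sketch assumes that the bracket in (1.3) is $1 - H_q(\delta)$, where $H_q(\delta) = \delta \log_q(q-1) - \delta \log_q \delta - (1-\delta)\log_q(1-\delta)$ is the standard $q$-ary entropy function; the names `q_ary_entropy` and `curve_1_3` are chosen for this example only.

```python
from math import log


def q_ary_entropy(delta, q):
    """Standard q-ary entropy H_q(delta) for 0 <= delta <= 1 (with 0*log 0 = 0)."""
    def xlog(x):
        return x * log(x, q) if x > 0 else 0.0
    return delta * log(q - 1, q) - xlog(delta) - xlog(1 - delta)


def curve_1_3(delta, q):
    """R = (1 - H_q(delta)) / 2, meaningful for 0 <= delta <= 1 - 1/q."""
    return 0.5 * (1.0 - q_ary_entropy(delta, q))


# Sample the curve for q = 2 on a coarse grid of delta in [0, 1/2].
samples = [(i / 20, curve_1_3(i / 20, 2)) for i in range(11)]
```

Under this assumption the curve starts at $R = 1/2$ for $\delta = 0$ and reaches $R = 0$ at $\delta = 1 - q^{-1}$.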

By contrast, a statistical meaning of the asymptotic bound does not seem to be
known, and this appears as the intrinsic difficulty for a complete realization of the
project started in [ManMar]: interpreting the asymptotic bound as a "phase transition"
curve. Hopefully, a solution might be found if we imagine plotting code points in
the order of their growing Kolmogorov complexity, as was suggested and used in
[Man3] for renormalization of the halting problem. For the context of constructive
mathematics, cf. [CaHeWa] and references therein.

In any case, it is clear that code domains represent an interesting testing ground
for various versions of computability of subsets of $\mathbf{R}^2$, complementing the more
popular Julia and Mandelbrot fractal sets (cf. [BravC] and [BravYa]).



2. Code parameters and code points: a summary 

2.1. Constructive worlds of code parameters. Denote the set of all triples
$[n, q^k, d] \in \mathbf{N}^3$ corresponding to all (resp. linear) $[n,k,d]_q$-codes by $P_q$ (resp. $P_q^{lin}$).
Clearly, $P_q$ and $P_q^{lin}$ are infinite decidable subsets of $\mathbf{N}^3$. Therefore they admit
natural recursive and recursively invertible bijections with $\mathbf{N}$ ("admissible
numberings"), defined up to composition with any recursive permutation $\mathbf{N} \to \mathbf{N}$. Hence
$P_q$ and $P_q^{lin}$ are infinite constructive worlds in the sense of [Man3], Definition 1.2.1.

If $X$, $Y$ are two constructive worlds, we can unambiguously define the notions
of (partial) recursive maps $X \to Y$, enumerable and decidable subsets of $X$, $Y$,
$X \times Y$ etc., simply pulling them back to the numberings. For a more developed
categorical formalism, cf. [Man3].

2.2. Constructive world $S = [0, 1]^2 \cap \mathbf{Q}^2$. The set of all rational points of
the unit square in the $(R, \delta)$-plane also has a canonical structure of a constructive
world.

2.3. Enumerable sets of code points. Code points (1.1) of linear codes all
lie in $S$. To achieve this for unstructured codes, we will slightly amend (1.1) and
define the map $cp : P_q \to S$ ($cp$ stands for "code point") by

$$cp([n, q^k, d]) := \left(\frac{[k]}{n}, \frac{d}{n}\right) \tag{2.1}$$

where $[k]$ denotes the integer part of the (generally real) number $k$. On $P_q^{lin} \subset P_q$
it coincides with (1.1).
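
The map (2.1) can be computed in exact rational arithmetic. In the sketch below a triple is passed as $(n, M, d)$ with $M = q^k = \mathrm{card}(C)$, and $[k]$ is obtained by integer arithmetic only; the helper name `floor_log` and the calling convention are assumptions of this example.

```python
from fractions import Fraction


def floor_log(m, q):
    """Largest integer e with q**e <= m, computed without floating point."""
    assert m >= 1 and q >= 2
    e = 0
    power = q
    while power <= m:
        power *= q
        e += 1
    return e


def cp(n, m, d, q):
    """Code point ([k]/n, d/n) of the triple [n, m, d], where m = card(C) = q**k."""
    return Fraction(floor_log(m, q), n), Fraction(d, n)
```

For instance, a binary code of length 2 with 3 words and minimal distance 1 has $k = \log_2 3$, hence $[k] = 1$ and `cp(2, 3, 1, 2)` is the rational point $(1/2, 1/2)$.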

The motivation for choosing (2.1) is this: in the eventual study of computability
properties of the graph $R = \alpha_q(\delta)$, it is more transparent to approximate it by
points with rational coordinates, rather than logarithms.

Let $V_q$ (resp. $V_q^{lin}$) be the image $cp(P_q)$ (resp. $cp(P_q^{lin})$), i.e. the respective set
of code points in $S$. Since $cp$ is a total recursive function both on $P_q$ and $P_q^{lin}$, $V_q$
and $V_q^{lin}$ are recursively enumerable subsets of $S$.



2.4. Limit code points. Let $U_q$ (resp. $U_q^{lin}$) be the closed sets of limit points of
$V_q$ (resp. $V_q^{lin}$). We will call limit code points elements of $V_q \cap U_q$ (resp. $V_q^{lin} \cap U_q^{lin}$).
The remaining subset of isolated code points is defined as $V_q \setminus (V_q \cap U_q)$, and similarly
for linear codes.

Notice that we get one and the same set $U_q$, using transmission rates (1.1)
or (2.1). In fact, for any infinite sequence of pairwise distinct code parameters
$[n_i, q^{k_i}, d_i]$, $i = 1, 2, \dots$ we have $n_i \to \infty$, hence the convergence of the sequence of
code points (1.1) is equivalent to that of (2.1), and they have a common limit. The
resulting sets of isolated code points differ depending on the adopted definition (1.1)
or (2.1); however, the set of isolated codes, those whose code points are isolated,
remains the same.

Our main result in this section is the following characterization of limit and 
isolated code points in terms of the recursive map cp rather than topology of the 
unit square. 

We will say that a code point $x \in V_q$ has infinite (resp. finite) multiplicity, if
$cp^{-1}(x) \subset P_q$ is infinite (resp. finite). The same definition applies to $V_q^{lin}$ and $P_q^{lin}$.

2.5. Theorem. (a) Code points of infinite multiplicity are limit points.
Therefore isolated code points have finite multiplicity.

(b) Conversely, any point $(R_0, \delta_0)$ with rational coordinates satisfying the
inequality $0 < R_0 < \alpha_q(\delta_0)$ (resp. $0 < R_0 < \alpha_q^{lin}(\delta_0)$) is a code point (resp. linear
code point) of infinite multiplicity.

This statement (actually, a slightly weaker one) was seemingly first stated and
proved in [ManMar]. It makes me suspect that distinguishing between limit and
isolated code points might be algorithmically undecidable, since in general it is
algorithmically impossible to decide whether a given recursive function takes one of
its values at finitely or infinitely many points.

Similarly, one cannot expect a priori that limit and isolated code points form
two recursively enumerable sets, but this must be true, if $\alpha_q$ is computable: see
Theorem 3.3.1 below.

For completeness, I will reproduce the proof of Theorem 2.5 here. It is based
on the same "Spoiling Lemma" that underlies the only known proof of existence of
the asymptotic bounds $\alpha_q$ and $\alpha_q^{lin}$.

2.6. Proposition (Numerical spoiling). If there exists a linear $[n,k,d]_q$-code,
then there exist also linear codes with the following parameters:

(i) $[n+1, k, d]_q$ (always).

(ii) $[n-1, k, d-1]_q$ (if $n > 1$, $k > 0$).

(iii) $[n-1, k-1, d]_q$ (if $n > 1$, $k > 1$).

In the domain of unstructured codes statements (i) and (ii) remain true, whereas
in (iii) one should replace $[n-1, k-1, d]_q$ by $[n-1, k', d]_q$ for some $k - 1 \le k' < k$.

For a proof of Proposition 2.6, see e. g. [VlaNoTsfa] (linear codes) and [ManMar] 
(unstructured codes). 
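
To make two of these operations concrete at the level of codes rather than parameters, here is a small sketch for unstructured codes given as sets of tuples. It is not the proof from the cited references: `pad` realizes step (i) exactly, while `shorten` only performs the first move towards step (iii), yielding a code of length $n - 1$ with at least $\mathrm{card}(C)/q$ words and minimal distance at least $d$ (assuming $n \ge 2$). All names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def min_distance(code):
    """Minimal Hamming distance of a code; 0 if it has a single word."""
    return min((sum(x != y for x, y in zip(a, b))
                for a, b in combinations(code, 2)), default=0)


def pad(code, letter=0):
    """Step (i): append a fixed letter to every word.

    The length grows by 1; card(C) and d are unchanged.
    """
    return {w + (letter,) for w in code}


def shorten(code):
    """Towards step (iii): keep the words sharing the most frequent last
    letter, then drop the last coordinate (assumes word length >= 2)."""
    counts = Counter(w[-1] for w in code)
    letter = max(sorted(counts), key=counts.get)  # ties: smallest letter
    return {w[:-1] for w in code if w[-1] == letter}


# Binary even-weight code of length 3: a [3, 2, 2]_2-code.
even_weight = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}
```

Here `pad(even_weight)` has parameters $[4, 2, 2]_2$, and `shorten(even_weight)` is the repetition code $\{00, 11\}$ with parameters $[2, 1, 2]_2$.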

2.7. Proof of Theorem 2.5. (a) We first check that if a code point $(R_0, \delta_0) \in \mathbf{Q}^2$
is of infinite multiplicity, then it is a limit point. In fact, let $[n_i, q^{k_i}, d_i]$, $i \ge 1$, be an
infinite sequence of pairwise distinct code parameters such that $[k_i]/n_i = R_0$,
$d_i/n_i = \delta_0$ for all $i$. Then codes with parameters $[n_i + 1, q^{k_i}, d_i]$ (cf. 2.6 (i))
produce infinitely many pairwise distinct code points converging to $(R_0, \delta_0)$.

(b) Now consider a rational point $(R_0, \delta_0) \in \mathbf{Q}^2 \cap (0, 1)^2$ (unstructured or linear),
lying strictly below the respective asymptotic bound. Then there exists a code
point $(R_1, \delta_1)$ also lying strictly below the asymptotic bound, with $R_1 > R_0$ and
$\delta_1 > \delta_0$, because the functions $\alpha_q$ and $\alpha_q^{lin}$ decrease. Hence in the part of $U_q$ (resp.
$U_q^{lin}$) where $R \ge R_1$, $\delta \ge \delta_1$ there exists an infinite family of pairwise distinct
code points $(R_i, \delta_i)$, $i \ge 1$, coming from a family of unstructured (resp. linear)
$[N_i, K_i, D_i]_q$-codes.

Let $(R_0, \delta_0) = (k/n, d/n)$. Divide $N_i$ by $n$ with a remainder term, i.e. put
$N_i = (a_i - 1)n + r_i$, $a_i \ge 1$, $0 \le r_i < n$. Using repeatedly 2.6 (i), spoil the respective
$[N_i, K_i, D_i]_q$-code, replacing it by some $[a_i n, K_i, D_i]_q$-code. Its code point will have
slightly smaller coordinates than the initial $(R_i, \delta_i)$; however, for $N_i$ large enough,
it will remain in the domain $R \ge R_0$, $\delta \ge \delta_0$. Hence we may and will assume from
the start that in our sequence of $[N_i, K_i, D_i]_q$-codes all $N_i$'s are divisible by $n$:

$$N_i = a_i n. \tag{2.2}$$

In order to derive by spoiling from this sequence another sequence of pairwise
distinct codes, all of which have one and the same code point $(R_0, \delta_0) = (k/n, d/n)$,
we will first consider the case of linear codes where the procedure is neater, because
$[K_i] = K_i$. Since we have $K_i/N_i \ge k/n$, $D_i/N_i \ge d/n$, we get

$$K_i \ge a_i k, \quad D_i \ge a_i d.$$



To complete the proof, it remains to reduce the parameters $K_i$, $D_i$ to $a_i k$, $a_i d$
respectively, without reducing $N_i = a_i n$. In the linear case, this is achieved by application
of several steps 2.6 (ii), 2.6 (iii), followed by steps 2.6 (i).

In the unstructured case reducing $D_i$ can be done in the same way. It remains
to reduce $[K_i]$ to $a_i k$. One application of the step 2.6 (iii) produces $K_i'$ such that
either $[K_i'] = [K_i] - 1$, or $[K_i'] = [K_i]$. In the latter case, after restoring $N_i$ to its
former value, one must apply 2.6 (iii) again. After a finite number of such substeps,
we will finally get $[K_i] - 1$.
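
For linear codes, the reduction just described is a purely mechanical manipulation of parameter triples. The sketch below records the sequence of triples obtained by applying steps 2.6 (ii), then 2.6 (iii), then 2.6 (i); it works only with parameters and does not check that a code with the starting parameters exists. The function name and the sample values are assumptions of this example.

```python
def reduce_linear(N, K, D, n, k, d):
    """Spoil linear parameters [N, K, D] with N = a*n down to [a*n, a*k, a*d].

    Returns the list of intermediate triples, starting with (N, K, D).
    """
    if N % n != 0:
        raise ValueError("N must be divisible by n, cf. (2.2)")
    a = N // n
    if K < a * k or D < a * d:
        raise ValueError("need K >= a*k and D >= a*d")
    length, dim, dist = N, K, D
    steps = [(length, dim, dist)]
    while dist > a * d:                      # step 2.6 (ii)
        if not (length > 1 and dim > 0):
            raise ValueError("step (ii) needs n > 1, k > 0")
        length, dist = length - 1, dist - 1
        steps.append((length, dim, dist))
    while dim > a * k:                       # step 2.6 (iii)
        if not (length > 1 and dim > 1):
            raise ValueError("step (iii) needs n > 1, k > 1")
        length, dim = length - 1, dim - 1
        steps.append((length, dim, dist))
    while length < N:                        # step 2.6 (i) restores the length
        length += 1
        steps.append((length, dim, dist))
    return steps
```

With the hypothetical input $[12, 6, 4]$ and target point $(k/n, d/n) = (1/4, 1/4)$, the call `reduce_linear(12, 6, 4, 4, 1, 1)` ends at $[12, 3, 3]$, whose code point is again $(1/4, 1/4)$.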

2.8. Question. Can one find a recursive function $h(n, k, d, q)$ such that if an
$[n,k,d]_q$-code is isolated, and $a > h(n,k,d,q)$, there is no code with parameters
$[an, ak, ad]_q$?

3. Codes and computability

In this section, I will discuss computability of two types of closed sets in $[0, 1]^2$:
$U_q$ and $\Gamma_q :=$ the graph of $\alpha_q$, as well as their versions for linear codes. I will start
with a brief summary of the basic definitions of [BratWe] in our context.

3.1. Effective closed sets. First, we will consider $[0, 1]^2$, $U_q$ and $\Gamma_q$ as closed
subsets in a larger square, say $X := [-1, 2]^2$, with its structure of compact metric
space given by $d((a_i), (b_i)) := \max |a_i - b_i|$. The set $\mathcal{B}$ of open balls with rational
centers and radii in this space has a natural structure of a constructive world (cf.
2.1). Hence we may speak about (recursively) enumerable and decidable subsets of
$\mathcal{B}$.

Following [BratWe] and [La], we will consider three types of effectivity of closed
subsets $Y \subset X$:

(i) $Y$ is called recursively enumerable, if the subset

$$\{I \in \mathcal{B} \mid I \cap Y \ne \emptyset\} \subset \mathcal{B} \tag{3.1}$$

is recursively enumerable in $\mathcal{B}$.

(ii) $Y$ is called co-recursively enumerable, if the subset

$$\{I \in \mathcal{B} \mid \overline{I} \cap Y = \emptyset\} \subset \mathcal{B} \tag{3.2}$$

is recursively enumerable in $\mathcal{B}$ (here $\overline{I}$ is the closure of $I$).

(iii) $Y$ is called recursive, if it is simultaneously recursively enumerable and
co-recursively enumerable.

As a direct application of [BratWe] we find:

3.2. Proposition. The closures $\overline{V}_q$ and $\overline{V}_q^{lin}$ are recursively enumerable.

Proof. In fact, the range of the function $cp$ (see 2.3) is dense in $\overline{V}_q$, resp. $\overline{V}_q^{lin}$, and
we can apply [BratWe], Corollary 3.13(1)(d).
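
The idea behind this statement can be sketched directly: an open ball meets the closure of a set exactly when it contains a point of the set, so scanning an enumeration of code points gives a semi-decision procedure for (3.1). The code below is an illustration only, not the construction of [BratWe]; the names and the step budget `max_steps` are assumptions, and the answer `None` means "no witness found so far", never "the ball misses the closure".

```python
from fractions import Fraction
from itertools import islice


def max_metric(x, y):
    """The metric d((a_i), (b_i)) = max |a_i - b_i| of 3.1."""
    return max(abs(a - b) for a, b in zip(x, y))


def ball_meets_closure(points, center, radius, max_steps):
    """Semi-decide whether the open ball B(center, radius) meets the closure
    of the enumerated set, inspecting at most max_steps points.

    Returns True when a witness is found, None when the budget runs out.
    """
    for x in islice(points, max_steps):
        if max_metric(x, center) < radius:
            return True
    return None
```

Running this search for all rational balls in a dovetailed fashion, with growing budgets, enumerates exactly the balls that intersect the closure.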

3.3. Problem of computability of the asymptotic bound. Referring to
the Corollary 7.3 of [Brat], we will call $\alpha_q$ (resp. $\alpha_q^{lin}$) computable, if its graph $\Gamma_q$
(resp. $\Gamma_q^{lin}$) is co-recursively enumerable.

3.3.1. Theorem. Assume that $\alpha_q$ is computable. Then each of the following
sets is recursively enumerable:

(a) Code points lying strictly below the asymptotic bound.

(b) Isolated code points.

The same is true for linear codes, if $\alpha_q^{lin}$ is computable.

Proof. We start with the following remark. Choose any integer $N \ge 1$ and
consider the set $\Gamma_q^{(N)}$ which is the union of closed balls $I$ of the form

$$\left[\frac{p}{N}, \frac{p+1}{N}\right] \times \left[\frac{p'}{N}, \frac{p'+1}{N}\right] \subset X \tag{3.3}$$

satisfying $p, p' \in \mathbf{N}$, $I \cap \Gamma_q \ne \emptyset$. Then we have:

(i) The boundary of $\Gamma_q^{(N)}$ consists of two vertical (parallel to the $R$-axis) segments
at the ends and two piecewise linear connected closed curves: $\Gamma_{q,+}^{(N)}$ lying above $\Gamma_{q,-}^{(N)}$.

(ii) The distance of any point $x \in \Gamma_{q,-}^{(N)}$ to $\Gamma_{q,+}^{(N)}$ does not exceed $2/N$, and
similarly with $+$ and $-$ reversed.

Let us call an N-strip any connected closed set satisfying these conditions. 

Now, assuming $\alpha_q$ (resp. $\alpha_q^{lin}$) computable, that is, $\Gamma_q$ co-recursively
enumerable, choose $N$ and run the algorithm generating in some order all rational closed
balls $I$ such that $I \cap \Gamma_q = \emptyset$. Wait until their subset consisting of balls of the form
(3.3) covers the whole square $[0, 1]^2$ with the exception of a set whose closure is an
$N$-strip. This strip will then be an approximation to $\Gamma_q$ (resp. $\Gamma_q^{lin}$) containing the
respective graph in the subset of its inner points.

Run in parallel an algorithm generating all code points and divide each partial
list of code points into three parts depending on $N$: points lying below the $N$-strip
$\Gamma_q^{(N)}$, above $\Gamma_q^{(N)}$, and inside $\Gamma_q^{(N)}$.

When N grows, the growing first and second parts respectively will recursively 
enumerate code points below and above the asymptotic bound. 
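
The sorting of code points relative to an $N$-strip can be sketched as follows. Since $\alpha_q$ itself is unknown, the example uses the Plotkin line $\max(0,\ 1 - q\delta/(q-1))$ purely as a stand-in continuous decreasing function; it is not $\alpha_q$. Moreover, the strip is computed directly from this toy function, whereas in the proof it is obtained from an enumeration of balls disjoint from $\Gamma_q$. The strip used here is a slightly simplified band of grid squares covering the graph, and all names are hypothetical.

```python
from fractions import Fraction
from math import floor


def plotkin(delta, q):
    """Toy stand-in for alpha_q: max(0, 1 - q*delta/(q-1)). NOT alpha_q itself."""
    return max(Fraction(0), 1 - Fraction(q, q - 1) * delta)


def classify(point, N, f):
    """Place a rational point (R, delta) of the unit square relative to a band
    of 1/N-squares covering the graph R = f(delta), f continuous decreasing.

    Returns 'below', 'above' or 'inside'.
    """
    R, delta = point
    j = min(floor(delta * N), N - 1)        # column j/N <= delta <= (j+1)/N
    # On this column f takes values between f((j+1)/N) and f(j/N).
    low = Fraction(floor(f(Fraction(j + 1, N)) * N), N)
    high = Fraction(floor(f(Fraction(j, N)) * N) + 1, N)
    if R < low:
        return "below"
    if R > high:
        return "above"
    return "inside"


def toy_bound(delta):
    """The stand-in function for q = 2."""
    return plotkin(delta, 2)
```

A point classified as "below" or "above" for some $N$ keeps that status for the true curve represented by `f`, while points "inside" must wait for a finer strip; this is why only the first two lists grow into recursive enumerations.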

Remark. This reasoning also shows, in accordance with [Brat], that if we
assume $\Gamma_q$ only co-recursively enumerable, it will be automatically recursively
enumerable and therefore recursive.

3.4. Theorem. Assume that $U_q$ is recursive in the sense of 3.1 (iii). Then $\alpha_q$
is computable. The similar statement holds for linear codes.

Proof. Consider first a closed ball $I$ as in (3.3) that intersects $U_q$ whereas its
interior does not intersect $U_q$. A moment's reflection will convince the reader that
the lower left corner of this "ball" (a square in the Euclidean metric) is
precisely the intersection point $I \cap \Gamma_q$. Call such a ball an exceptional $N$-ball. Since
$\alpha_q$ is decreasing, we have

(a) Each horizontal strip $p/N \le R \le (p+1)/N$ and each vertical strip $q/N \le
\delta \le (q+1)/N$ can contain no more than one exceptional $N$-ball.

(b) If one exceptional $N$-ball lies to the right of another one, then it also lies
lower than that one.

Generally, call a set of $N$-balls $N$-admissible, if it satisfies (a) and (b).
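
Conditions (a) and (b) are straightforward to test on a finite set of grid squares. In the sketch below an $N$-ball is encoded by a pair of integer indices `(p, s)`, where `p` is the index of its horizontal strip (the $R$-direction) and `s` the index of its vertical strip (the $\delta$-direction); this encoding and the function name are assumptions of the example.

```python
def is_admissible(balls):
    """Check conditions (a) and (b) for a finite set of index pairs (p, s).

    (a) at most one ball in each horizontal and each vertical strip;
    (b) a ball lying further to the right (larger s) lies lower (smaller p).
    """
    ordered = sorted(set(balls), key=lambda b: b[1])   # left to right
    rows = [p for p, _ in ordered]
    cols = [s for _, s in ordered]
    one_per_strip = (len(set(rows)) == len(rows)
                     and len(set(cols)) == len(cols))
    descending = all(r1 > r2 for r1, r2 in zip(rows, rows[1:]))
    return one_per_strip and descending
```

For example, the staircase `{(3, 0), (2, 1), (0, 2)}` is admissible, while `{(1, 0), (2, 1)}` is not, because the right-hand square lies higher.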

Now, assuming $U_q$ recursive and having chosen $N$, we can run in parallel two
algorithms: one generating closed balls (3.3) not intersecting $U_q$ and another
generating open balls (3.3) intersecting $U_q$. Run them until all $N$-balls are generated,
with a possible exception of an $N$-admissible subset $X_q^{(N)}$, then stop generation.
Let $U_{q,+}^{(N)}$ be the union of all balls generated by the first algorithm, and $U_{q,-}^{(N)}$ the
union of all balls generated by the second algorithm.

Look through all the balls in $X_q^{(N)}$ in turn. If there are elements in it whose
closure does not intersect the closure of $U_{q,-}^{(N)}$, delete them from $X_q^{(N)}$ and put them
into $U_{q,+}^{(N)}$. Similarly, if there are elements in it whose closure does not intersect
the (initial) $U_{q,+}^{(N)}$, delete them from $X_q^{(N)}$ and put them into $U_{q,-}^{(N)}$.

Keep the old notations $U_{q,-}^{(N)}$, $U_{q,+}^{(N)}$, $X_q^{(N)}$ for these amended sets.

Now, the union of the lower boundary of $U_{q,+}^{(N)}$ and the upper boundary of $U_{q,-}^{(N)}$
will approximate $\Gamma_q$ from two sides, with error not exceeding $N^{-1}$. (Here a
"boundary" means the respective set of boundary squares.)

Clearly, this reasoning also shows computability of $\alpha_q$ in the sense of 3.3.